Figure 1 



Heteractis crispa chromoprotein wild type (base isoform) 

10 20 30 40 50 60 

5' ACCATTTGCTTTGGTTCCTTGGCAAACGAAAGTTTAGAACGAAAACTGACCCAAATTACA
70 80 90 100 110 120 

TCTTCCTCCTGGATCCTTACCATGGCTGGTTTGTTGAAAGAAAGTATGCGCATCAAGATG 

MAGLLKESMRIKM 
130 140 150 160 170 180 

TACATGGAAGGCACGGTTAATGGCCATTATTTCAAGTGTGAAGGAGAGGGAGACGGCAAC 
YMEGTVNGHYFKCEGEGDGN 

190 200 210 220 230 240 

CCATTTACAGGTACGCAGAGCATGAGGATTCATGTCACCGAAGGGGCTCCATTACCATTT 
PFTGTQSMRIHVTEGAPLPF

250 260 270 280 290 300 

GCCTTCGACATTTTGGCACCGTGTTGTGAGTACGGCAGCAGGACCTTTGTCCACCATACG 
AFDILAPCCEYGSRTFVHHT

310 320 330 340 350 360 

GCAGAGATTCCCGATTTCTTCAAGCAGTCTTTCCCTGAAGGCTTTACTTGGGAAAGAACC 
AEIPDFFKQSFPEGFTWERT 

370 380 390 400 410 420 

ACAACCTATGAAGATGGAGGCATTCTTACTGCTCATCAGGACACAAGCCTGGAGGGGAAC 
TTYEDGGILTAHQDTSLEGN 

430 440 450 460 470 480 

TGCCTTATATACAAGGTGAAAGTCCTTGGTACCAATTTTCCTGCTGATGGCCCCGTGATG 
CLIYKVKVLGTNFPADGPVM 

490 500 510 520 530 540 

AAGAACAAATCAGGAGGATGGGAGCCATGCACTGAGGTGGTTTATCCAGAGAATGGTGTC 
KNKSGGWEPCTEVVYPENGV 

550 560 570 580 590 600 

CTGTGTGGACGTAATGTGATGGCCCTTAAAGTCGGTGATCGTCGTTTGATCTGCCATCTC 
LCGRNVMALKVGDRRLICHL

610 620 630 640 650 660 

TATACTTCTTACAGGTCCAAGAAAGCAGTCCGTGCCTTGACAATGCCAGGATTTCATTTT 
YTSYRSKKAVRALTMPGFHF 

670 680 690 700 710 720 

ACAGACATCCGCCTTCAGATGCCGAGGAAAAAGAAAGACGAGTACTTTGAACTGTACGAA 
TDIRLQMPRKKKDEYFELYE 

730 740 750 760 770 780 

GCATCTGTGGCTAGGTACAGTGATCTTCCTGAAAAAGCAAATTGATTGTTCCCAGTGACA 
ASVARYSDLPEKAN* 

790 800 810 820 830 840 

CCAGACTGCTGTCAGCTTTTGGTTAAAGCCCGAAAGACAAAAGGACATTTGTAGTTTAGT 

850 860 870 880 890 900 

TTATATTTCCCTTTCATTTGTGAATCAACATTGTACTCTCTGTAAACCTTTAAAATGCTC 
910 

CATTAAACCT 3' (SEQ ID NOs: 01 & 02)



Figure 2 

Heteractis crispa chromoprotein wild type (second isoform) 

10 20 30 40 50 60 

5' ACCATTTGCTTTGGTTCCTTGGCAAACGAAAGTTTAGACGAAAACTGACCCAAATTACAT

70 80 90 100 110 120 

CCTCCTGATCCTTACCATGGCTGGTTTGTTGAAAGAAAGTATGCGCATCAAGATGTACAT 

MAGLLKESMRIKMYM 

130 140 150 160 170 180 

GGAAGGCACGGTTAATGGCCATTATTTCAAGTGTGAAGGAGAGGGAGACGGCAACCCATT 
EGTVNGHYFKCEGEGDGNPF 

190 200 210 220 230 240 

TACAGGTACGCAGAGCATGAGGATTCATGTCACCGAAGGGGCTCCATTACCATTTGCCTT 
TGTQSMRIHVTEGAPLPFAF

250 260 270 280 290 300 

CGACATTTTGGCACCGTGTTGTGAGTACGGCAGCAGGACCTTTGTCCACCATACGGCAGA 
DILAPCCEYGSRTFVHHTAE 

310 320 330 340 350 360 

GATTCCCGATTTCTTCAAGCAGTCTTTCCCTGAAGGCTTTACTTGGGAAAGAACCACAAC 
IPDFFKQSFPEGFTWERTTT

370 380 390 400 410 420 

CTATGAAGATGGAGGCATTCTTACTGCTCATCAGGACACAAGCCTGGAGGGGAACTGCCT 
YEDGGILTAHQDTSLEGNCL 

430 440 450 460 470 480 

TATATACAAGGTGAAAGTCCTTGGTACCAATTTTCCTGCTGATGGCCCCGTGATGAAGAA 
IYKVKVLGTNFPADGPVMKN 

490 500 510 520 530 540 

CAAATCAGAAGGATGGGAGCCATGCACTGAGGTGGTTTATCCAGATAATGGTGTCCTGTG 
KSEGWEPCTEVVYPDNGVLC 

550 560 570 580 590 600 

TGGACGTAATGTGATGGCCCTTAAAGTCGGTGATCGTCGTTTGATCTGCCATCTCTATAC 
GRNVMALKVGDRRLICHLYT

610 620 630 640 650 660 

TTCTTACAGGTCCAAGAAAGCAGTCCGTGCCTTGACAATGCCAGGATTTCATTTTACAGA 
SYRSKKAVRALTMPGFHFTD 

670 680 690 700 710 720 

CATCCGCCTTCAGATGCCGAGGAAAAAGAAAGACGAGTACTTTGAACTGTACGAAGCATC 
IRLQMPRKKKDEYFELYEAS 

730 740 750 760 770 780 

TGTGGCTAGGTACAGTGATCTTCCTGAAAAAGCAAATTGATTGTTCCCAGTGACACCAGA 
VARYSDLPEKAN* 

790 800 810 820 830 840 

CTGCTGTCAGCTTTTGGTTAAAGCCCGAAAGACAAAAGGACATTTGTAGTTTTAGTTTAT 

850 860 870 880 890 900 

ATTTTCCCTTTCATTTTGTGAATCAACATTGTACTCTCTGTAAACCTTTAAAATGCTCCA 

TTAAACCT 3' (SEQ ID NOs: 03 & 04)



Figure 3 

Figure 4



Heteractis crispa fluorescent protein mutant C148S 

C148S according to GFP numbering 
C143S according to self-numbering. 



ATGGCTGGTTTGTTGAAAGAAAGTATGCGCATCAAGATG 
MAGLLKESMRIKM 

TACATGGAAGGCACGGTTAATGGCCATTATTTCAAGTGTGAAGGAGAGGGAGACGGCAAC 
YMEGTVNGHYFKCEGEGDGN 

CCATTTACAGGTACGCAGAGCATGAGGATTCATGTCACCGAAGGGGCTCCATTACCATTT 
PFTGTQSMRIHVTEGAPLPF 

GCCTTCGACATTTTGGCACCGTGTTGTGAGTACGGCAGCAGGACCTTTGTCCACCATACG 
AFDILAPCCEYGSRTFVHHT

GCAGAGATTCCCGATTTCTTCAAGCAGTCTTTCCCTGAAGGCTTTACTTGGGAAAGAACC 
AEIPDFFKQSFPEGFTWERT 

ACAACCTATGAAGATGGAGGCATTCTTACTGCTCATCAGGACACAAGCCTGGAGGGGAAC 
TTYEDGGILTAHQDTSLEGN 

TGCCTTATATACAAGGTGAAAGTCCTTGGTACCAATTTTCCTGCTGATGGCCCCGTGATG 
CLIYKVKVLGTNFPADGPVM 

AAGAACAAATCAGGAGGATGGGAGCCAAGCACTGAGGTGGTTTATCCAGAGAATGGTGTC 
KNKSGGWEPSTEVVYPENGV 

CTGTGTGGACGTAATGTGATGGCCCTTAAAGTCGGTGATCGTCGTTTGATCTGCCATCTC 
LCGRNVMALKVGDRRLICHL 

TATACTTCTTACAGGTCCAAGAAAGCAGTCCGTGCCTTGACAATGCCAGGATTTCATTTT 
YTSYRSKKAVRALTMPGFHF 

ACAGACATCCGCCTTCAGATGCCGAGGAAAAAGAAAGACGAGTACTTTGAACTGTACGAA 
TDIRLQMPRKKKDEYFELYE 

GCATCTGTGGCTAGGTACAGTGATCTTCCTGAAAAAGCAAATTGA 
ASVARYSDLPEKAN* 
(SEQ ID NOS: 05 & 06) 
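
The C143S label (self-numbering) can be cross-checked against the listings by translating matching rows of the wild-type and mutant sequences and reporting the residues that differ. The following is a minimal sketch, not part of the original figures: it assumes the standard genetic code, uses the eighth row of the coding sequence as it appears in Figures 1 and 4, and takes 134 as the self-numbering position of that row's first residue (13 residues in the first row plus 20 per row thereafter). The names `translate` and `point_mutations` are illustrative.

```python
from itertools import product

# Standard genetic code, built in the conventional TCAG order (assumption:
# the figures use the standard code, which matches the printed translations).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): AMINO[i] for i, c in enumerate(product(BASES, repeat=3))}


def translate(nt):
    """Translate an in-frame nucleotide string, ignoring any trailing partial codon."""
    usable = len(nt) - len(nt) % 3
    return "".join(CODON_TABLE[nt[i:i + 3]] for i in range(0, usable, 3))


def point_mutations(wt_nt, mut_nt, first_residue):
    """List substitutions such as 'C143S' between two aligned, equal-length rows."""
    wt, mut = translate(wt_nt), translate(mut_nt)
    return [
        f"{a}{first_residue + i}{b}"
        for i, (a, b) in enumerate(zip(wt, mut))
        if a != b
    ]


# Row covering residues 134-153 (self-numbering) in Figure 1 and Figure 4.
WT_ROW = "AAGAACAAATCAGGAGGATGGGAGCCATGCACTGAGGTGGTTTATCCAGAGAATGGTGTC"
MUT_ROW = "AAGAACAAATCAGGAGGATGGGAGCCAAGCACTGAGGTGGTTTATCCAGAGAATGGTGTC"

print(point_mutations(WT_ROW, MUT_ROW, 134))
```

The same helper could be applied row by row to the other mutant listings, provided the rows are in frame and the first-residue number is adjusted accordingly.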



Figure 5A 




Figure 5B 




Figure 6 



Heteractis crispa fluorescent protein mutant 44-9 

point mutations: A5S,T39A,C148S,L181H,P208L,K211E according to GFP numbering

A2S,T36A,C143S,L173H,P201L,K204E according to self-numbering.

80 90 100 110 120 

TCTGGTTTGTTGAAAGAAAGTATGCGCATCAAGATGTACAT 
SGLLKESMRIKMYM 

130 140 150 160 170 180 

GGAAGGCACGGTTAATGGCCATTATTTCAAGTGTGAAGGAGAGGGAGACGGCAACCCATT 
EGTVNGHYFKCEGEGDGNPF 

190 200 210 220 230 240 

TGCAGGTACGCAGAGCATGAGGATTCATGTCACCGAAGGGGCTCCATTACCATTTGCCTT 
AGTQSMRIHVTEGAPLPFAF

250 260 270 280 290 300 

CGACATTTTGGCACCGTGTTGTGAGTACGGCAGCAGGACCTTTGTCCACCATACGGCAGA 
DILAPCCEYGSRTFVHHTAE 

310 320 330 340 350 360 

GATTCCCGATTTCTTCAAGCAGTCTTTCCCTGAAGGCTTTACTTGGGAAAGAACCACAAC 
IPDFFKQSFPEGFTWERTTT 

370 380 390 400 410 420 

CTATGAAGATGGAGGCATTCTTACTGCTCATCAGGACACAAGCCTGGAGGGGAACTGCCT 
YEDGGILTAHQDTSLEGNCL

430 440 450 460 470 480 

TATATACAAGGTGAAAGTCCTTGGTACCAATTTTCCTGCTGATGGCCCCGTGATGAAGAA 
IYKVKVLGTNFPADGPVMKN 

490 500 510 520 530 540 

CAAATCAGGAGGATGGGAGCCAAGCACTGAGGTGGTTTATCCAGAGAATGGTGTCCTGTG 
KSGGWEPSTEVVYPENGVLC 

550 560 570 580 590 600 

TGGACGTAATGTGATGGCCCTTAAAGTCGGTGATCGTCGTTTGATCTGCCATCACTATAC 
GRNVMALKVGDRRLICHHYT

610 620 630 640 650 660 

TTCTTACAGGTCCAAGAAAGCAGTCCGTGCCTTGACAATGCCAGGATTTCATTTTACAGA 
SYRSKKAVRALTMPGFHFTD 

670 680 690 700 710 720 

CATCCGCCTTCAGATGCTGAGGAAAGAGAAAGACGAGTACTTTGAACTGTACGAAGCATC 
IRLQMLRKEKDEYFELYEAS 

730 740 750 760 

TGTGGCTAGGTACAGTGATCTTCCTGAAAAAGCAAATTGA 
VARYSDLPEKAN* (SEQ ID NOs: 07 & 08)
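
The Figure 6 caption gives the same six substitutions under two numbering schemes. The sketch below pairs the two lists, in the order they are printed, and reports the offset between GFP numbering and self-numbering at each mutated position. It is an illustration only: the pairing by list order is an assumption, and the helper names `parse` and `numbering_offsets` are hypothetical.

```python
import re

# Substitution labels as listed in the Figure 6 caption.
SELF_NUMBERING = ["A2S", "T36A", "C143S", "L173H", "P201L", "K204E"]
GFP_NUMBERING = ["A5S", "T39A", "C148S", "L181H", "P208L", "K211E"]


def parse(label):
    """Split a label such as 'C143S' into ('C', 143, 'S')."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", label)
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def numbering_offsets(self_labels, gfp_labels):
    """Map each self-numbering position to (GFP position - self position)."""
    offsets = {}
    for self_label, gfp_label in zip(self_labels, gfp_labels):
        s_from, s_pos, s_to = parse(self_label)
        g_from, g_pos, g_to = parse(gfp_label)
        if (s_from, s_to) != (g_from, g_to):
            raise ValueError(f"{self_label} and {gfp_label} describe different changes")
        offsets[s_pos] = g_pos - s_pos
    return offsets


print(numbering_offsets(SELF_NUMBERING, GFP_NUMBERING))
```

The offset is not constant along the chain, which is what one would expect if GFP numbering comes from an alignment with gaps; a single fixed shift therefore cannot convert between the two schemes.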



Figure 7A 

Figure 7B

Figure 8

Crispa 44-6 mutant possesses six amino acid substitutions vs. wild type: 
A2S,T36A,A65E,C143S,L173H,P201L. 

TCTGGTTTGTTGAAAGAAAGTATGCGCATCAAGATGTACAT 
SGLLKESMRIKMYM 

GGAAGGCACGGTTAATGGCCATTATTTCAAGTGTGAAGGAGAGGGAGACGGCAACCCATT 
EGTVNGHYFKCEGEGDGNPF 

TGCAGGTACGCAGAGCATGAGGATTCATGTCACCGAAGGGGCTCCATTACCATTTGCCTT 
AGTQSMRIHVTEGAPLPFAF 

CGACATTTTGGCACCGTGTTGTGCGTACGGCAGCAGGACCTTTGTCCACCATACGGCAGA 
DILAPCCAYGSRTFVHHTAE 

GATTCCCGATTTCTTCAAGCAGTCTTTCCCTGAAGGCTTTACTTGGGAAAGAACCACAAC 
IPDFFKQSFPEGFTWERTTT

CTATGAAGATGGAGGCATTCTTACTGCTCATCAGGACACAAGCCTGGAGGGGAACTGCCT 
YEDGGILTAHQDTSLEGNCL

TATATACAAGGTGAAAGTCCTTGGTACCAATTTTCCTGCTGATGGCCCCGTGATGAAGAA 
IYKVKVLGTNFPADGPVMKN 

CAAATCAGGAGGATGGGAGCCAAGCACTGAGGTGGTTTATCCAGAGAATGGTGTCCTGTG 
KSGGWEPSTEVVYPENGVLC



TGGACGTAATGTGATGGCCCTTAAAGTCGGTGATCGTCGTTTGATCTGCCATCACTATAC 
GRNVMALKVGDRRLICHHYT

TTCTTACAGGTCCAAGAAAGCAGTCCGTGCCTTGACAATGCCAGGATTTCATTTTACAGA 
SYRSKKAVRALTMPGFHFTD 



CATCCGCCTTCAGATGCTGAGGAAAGAGAAAGACGAGTACTTTGAACTGTACGAAGCATC 
IRLQMLRKEKDEYFELYEAS 

TGTGGCTAGGTACAGTGATCTTCCTGAAAAAGCAAATTGA 
VARYSDLPEKAN* 



(SEQ ID NOs: 09 & 10)



Figure 10 



The amino acid sequence of FP10-cr1 is: 
(SEQ ID NO:12)

A "humanized" nt sequence encoding the above cr-1 mutant is: 



ATGGTGAGCGGCCTGCTGAAGGAGAGTATGCGCATCAAGATGTACATGGAGGGCACCGTGAACGGCCAC 
TACTTCAAGTGCGAGGGCGAGGGCGACGGCAACCCCTTCGCCGGCACCCAGAGCATGAGAATCCACGTG 
ACCGAGGGCGCCCCCCTGCCCTTCGCCTTCGACATCCTGGCCCCCTGCTGCGAGTACGGCAGCAGGACC 
TTCGTGCACCACACCGCCGAGATCCCCGACTTCTTCAAGCAGAGCTTCCCCGAGGGCTTCACCTGGGAG 
AGAACCACCACCTACGAGGACGGCGGCATCCTGACCGCCCACCAGGACACCAGCCTGGAGGGCAACTGC
CTGATCTACAAGGTGAAGGTGCACGGCACCAACTTCCCCGCCGACGGCCCCGTGATGAAGAACAAGAGC 
GGCGGCTGGGAGCCCAGCACCGAGGTGGTGTACCCCGAGAACGGCGTGCTGTGCGGCCGGAACGTGATG 
GCCCTGAAGGTGGGCGACCGGCACCTGATCTGCCACCACTACACCAGCTACCGGAGCAAGAAGGCCGTG 
CGCGCCCTGACCATGCCCGGCTTCCACTTCACCGACATCCGGCTCCAGATGCTGCGGAAGAAGAAGGAC 
GAGTACTTCGAGCTGTACGAGGCCAGCGTGGCCCGGTACAGCGACCTGCCCGAGAAGGCCAACTGA 
(SEQ ID NO:11)



Alternative cr1 amino acid sequence

MSGLLKESMRIKMYMEGTVNGHYFKCEGEGDGNPFAGTQSMRIHVTEGAPLPFAFDILAPCCEYGSRTF 
VHHTAEIPDFFKQSFPEGFTWERTTTYEDGGILTAHQDTSLEGNCLIYKVKVHGTNFPADGPVMKNKSG 
GWEPSTEVVYPENGVLCGRNVMALKVGDRRLICHHYTSYRSKKAVRALTMPGFHFTDIRLQMLRKEKDE 
YFELYEASVARYSDLPEKAN* (SEQ ID NO: 14)



Nucleotide sequence encoding the above alternative sequence:

ATGGTGAGCGGCCTGCTGAAGGAGAGCATGCGCATCAAGATGTACATGGAGGGCACCGTGAACGGCCAC 
TACTTCAAGTGCGAGGGCGAGGGCGACGGCAACCCCTTCGCCGGCACCCAGAGCATGCGGATCCACGTG 
ACCGAGGGCGCCCCCCTGCCCTTCGCCTTCGACATCCTGGCCCCCTGCTGCGAGTACGGCAGCAGGACC 
TTCGTGCACCACACCGCCGAGATCCCCGACTTCTTCAAGCAGAGCTTCCCCGAGGGCTTCACCTGGGAG 
AGAACCACCACCTACGAGGACGGCGGCATCCTGACCGCCCACCAGGACACCAGCCTGGAGGGCAACTGC 
CTGATCTACAAGGTGAAGGTGCTGGGCACCAACTTCCCCGCCGACGGCCCCGTGATGAAGAACAAGAGC 
GGCGGCTGGGAGCCCAGCACCGAGGTGGTGTACCCCGAGAACGGCGTGCTGTGCGGCCGGAACGTGATG 
GCCCTGAAGGTGGGCGACCGGCGGCTGATCTGCCACCACTACACCAGCTACCGGAGCAAGAAGGCCGTG 
CGGGCCCTGACCATGCCCGGCTTCCACTTCACCGACATCCGGCTGCAGATGCTGCGGAAGGAGAAGGAC 
GAGTACTTCGAGCTGTACGAGGCCAGCGTGGCCCGGTACAGCGACCTGCCCGAGAAGGCCAACTGA 
(SEQ ID NO: 13) 
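
The "humanized" listings change codons without changing the encoded residues. The sketch below illustrates this on one stretch of twenty residues (YMEGTVNGHYFKCEGEGDGN), comparing the codons printed in the Figure 4 listing with the codons printed for the same residues in the first "humanized" listing above. It assumes the standard genetic code; `codons`, `translate`, `synonymous_changes` and `gc_fraction` are illustrative helper names, not anything defined in the figures.

```python
from itertools import product

# Standard genetic code in TCAG order (assumption: standard code applies).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): AMINO[i] for i, c in enumerate(product(BASES, repeat=3))}


def codons(nt):
    """Split an in-frame nucleotide string into codons, dropping a partial tail."""
    return [nt[i:i + 3] for i in range(0, len(nt) - len(nt) % 3, 3)]


def translate(nt):
    return "".join(CODON_TABLE[c] for c in codons(nt))


def synonymous_changes(native, recoded):
    """Return (codon number, native codon, recoded codon) for each changed codon."""
    if translate(native) != translate(recoded):
        raise ValueError("the two sequences do not encode the same residues")
    pairs = zip(codons(native), codons(recoded))
    return [(i + 1, a, b) for i, (a, b) in enumerate(pairs) if a != b]


def gc_fraction(nt):
    return (nt.count("G") + nt.count("C")) / len(nt)


# Same twenty residues as printed in Figure 4 and in the "humanized" listing.
NATIVE = "TACATGGAAGGCACGGTTAATGGCCATTATTTCAAGTGTGAAGGAGAGGGAGACGGCAAC"
HUMANIZED = "TACATGGAGGGCACCGTGAACGGCCACTACTTCAAGTGCGAGGGCGAGGGCGACGGCAAC"

for number, old, new in synonymous_changes(NATIVE, HUMANIZED):
    print(number, old, "->", new, CODON_TABLE[old])
print(round(gc_fraction(NATIVE), 2), round(gc_fraction(HUMANIZED), 2))
```

In this stretch every changed codon ends in G or C after recoding, so the GC fraction rises while the translation stays the same.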




Figure 11




Figure 12



Cr-449-tandem (4-amino acid linker between monomers is in double underline).



1 A CCG GT C GCC ACC ATG GTG AGC GGC CTG CTG AAG GAG AGC ATG CGC 46 

1 AgeI MVSGLLKESMR 11

47 ATC AAG ATG TAC ATG GAG GGC ACC GTG AAC GGC CAC TAC TTC AAG TGC 94

12 IKMYMEGTVNGHYFKC 27

95 GAG GGC GAG GGC GAC GGC AAC CCC TTC GCC GGC ACC CAG AGC ATG CGG 142

28 EGEGDGNPFAGTQSMR 43

143 ATC CAC GTG ACC GAG GGC GCC CCC CTG CCC TTC GCC TTC GAC ATC CTG 190 

44 IHVTEGAPLPFAFDIL 59 

191 GCC CCC TGC TGC GAG TAC GGC AGC AGG ACC TTC GTG CAC CAC ACC GCC 238

60 APCCEYGSRTFVHHTA 75

239 GAG ATC CCC GAC TTC TTC AAG CAG AGC TTC CCC GAG GGC TTC ACC TGG 286

76 EIPDFFKQSFPEGFTW 91

287 GAG AGA ACC ACC ACC TAC GAG GAC GGC GGC ATC CTG ACC GCC CAC CAG 334

92 ERTTTYEDGGILTAHQ 107

335 GAC ACC AGC CTG GAG GGC AAC TGC CTG ATC TAC AAG GTG AAG GTG CTG 382

108 DTSLEGNCLIYKVKVL 123

383 GGC ACC AAC TTC CCC GCC GAC GGC CCC GTG ATG AAG AAC AAG AGC GGC 430

124 GTNFPADGPVMKNKSG 139 

431 GGC TGG GAG CCC AGC ACC GAG GTG GTG TAC CCC GAG AAC GGC GTG CTG 478 

140 GWEPSTEVVYPENGVL 155 

479 TGC GGC CGG AAC GTG ATG GCC CTG AAG GTG GGC GAC CGG CGG CTG ATC 526

156 CGRNVMALKVGDRRLI 171

527 TGC CAC CAC TAC ACC AGC TAC CGG AGC AAG AAG GCC GTG CGG GCC CTG 574

172 CHHYTSYRSKKAVRAL 187 

575 ACC ATG CCC GGC TTC CAC TTC ACC GAC ATC CGG CTG CAG ATG CTG CGG 622 

188 TMPGFHFTDIRLQMLR 203 

623 AAG GAG AAG GAC GAG TAC TTC GAG CTG TAC GAG GCC AGC GTG GCC CGG 670

204 KEKDEYFELYEASVAR 219 

671 TAC AGC GAC CTG CCC GAG AAG GCC AAC AGA TCT CCC GGG ATG GTG AGC 718 

220 YSDLPEKANRSPGMVS 235 

719 GGC CTG CTG AAG GAG AGC ATG CGC ATC AAG ATG TAC ATG GAG GGC ACC 766 

236 GLLKESMRIKMYMEGT 251 

767 GTG AAC GGC CAC TAC TTC AAG TGC GAG GGC GAG GGC GAC GGC AAC CCC 814

252 VNGHYFKCEGEGDGNP 267 

815 TTC GCC GGC ACC CAG AGC ATG CGG ATC CAC GTG ACC GAG GGC GCC CCC 862 

268 FAGTQSMRIHVTEGAP 283 



863 CTG CCC TTC GCC TTC GAC ATC CTG GCC CCC TGC TGC GAG TAC GGC AGC 910

284 LPFAFDILAPCCEYGS 299



Figure 12 (continued) 

911 AGG ACC TTC GTG CAC CAC ACC GCC GAG ATC CCC GAC TTC TTC AAG CAG 958

300 RTFVHHTAEIPDFFKQ 315

959 AGC TTC CCC GAG GGC TTC ACC TGG GAG AGA ACC ACC ACC TAC GAG GAC 1006

316 SFPEGFTWERTTTYED 331

1007 GGC GGC ATC CTG ACC GCC CAC CAG GAC ACC AGC CTG GAG GGC AAC TGC 1054

332 GGILTAHQDTSLEGNC 347

1055 CTG ATC TAC AAG GTG AAG GTG CTG GGC ACC AAC TTC CCC GCC GAC GGC 1102

348 LIYKVKVLGTNFPADG 363

1103 CCC GTG ATG AAG AAC AAG AGC GGC GGC TGG GAG CCC AGC ACC GAG GTG 1150

364 PVMKNKSGGWEPSTEV 379

1151 GTG TAC CCC GAG AAC GGC GTG CTG TGC GGC CGG AAC GTG ATG GCC CTG 1198

380 VYPENGVLCGRNVMAL 395






1199 AAG GTG GGC GAC CGG CGG CTG ATC TGC CAC CAC TAC ACC AGC TAC CGG 1246 

396 KVGDRRLICHHYTSYR 411

1247 AGC AAG AAG GCC GTG CGG GCC CTG ACC ATG CCC GGC TTC CAC TTC ACC 1294

412 SKKAVRALTMPGFHFT 427 

1295 GAC ATC CGG CTG CAG ATG CTG CGG AAG GAG AAG GAC GAG TAC TTC GAG 1342 

428 DIRLQMLRKEKDEYFE 443 

1343 CTG TAC GAG GCC AGC GTG GCC CGG TAC AGC GAC CTG CCC GAG AAG GCC 1390

444 LYEASVARYSDLPEKA 459 

1391 AAC TGA 

460 N * 



(SEQ ID NOS. 15 & 16) 
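
The residue and nucleotide coordinates printed in Figure 12 follow from the lengths of the parts. The sketch below recomputes them under stated assumptions: a 13-nt leader (ACCGGTCGCCACC) before the first ATG, a 228-residue monomer (M1 to N228), and the linker taken to be the RSPG stretch at residues 229-232, which is the only four residues lying between the two MVSG... copies in the listing. The function name `tandem_layout` and the constant names are hypothetical.

```python
LEADER_NT = 13     # "ACCGGTCGCCACC" precedes the first ATG in the listing
MONOMER_AA = 228   # residues M1..N228 of one Cr-449 copy (stop codon excluded)
LINKER = "RSPG"    # assumed linker, residues 229-232 in the listing


def tandem_layout(monomer_aa, linker, leader_nt):
    """Return {part: (first_residue, last_residue, first_nt, last_nt)}."""
    parts = [
        ("monomer 1", monomer_aa),
        ("linker", len(linker)),
        ("monomer 2", monomer_aa),
    ]
    layout = {}
    residue = 1
    for name, length in parts:
        first, last = residue, residue + length - 1
        first_nt = leader_nt + 3 * (first - 1) + 1  # first base of first codon
        last_nt = leader_nt + 3 * last              # last base of last codon
        layout[name] = (first, last, first_nt, last_nt)
        residue = last + 1
    return layout


layout = tandem_layout(MONOMER_AA, LINKER, LEADER_NT)
for name, coords in layout.items():
    print(name, coords)

# The stop codon occupies the three bases after the last coding base.
stop_start = layout["monomer 2"][3] + 1
print("stop codon at", stop_start, "-", stop_start + 2)
```

Under these assumptions the second copy starts at residue 233 (nucleotide 710) and the final residue is number 460, which agrees with the row labels in the figure.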



Figure 13 



Cr-449-tandem-actin (4-amino acid linker between Cr-449 monomers is noted in
double underline; 4-amino acid linker between second Cr-449 and actin is noted
in dashed underline).






1 A CCG GT C GCC ACC ATG GTG AGC GGC CTG CTG AAG GAG AGC ATG CGC 46 

1 AgeI MVSGLLKESMR 11

47 ATC AAG ATG TAC ATG GAG GGC ACC GTG AAC GGC CAC TAC TTC AAG TGC 94

12 IKMYMEGTVNGHYFKC 27

95 GAG GGC GAG GGC GAC GGC AAC CCC TTC GCC GGC ACC CAG AGC ATG CGG 142

28 EGEGDGNPFAGTQSMR 43

143 ATC CAC GTG ACC GAG GGC GCC CCC CTG CCC TTC GCC TTC GAC ATC CTG 190 

44 IHVTEGAPLPFAFDIL 59 

191 GCC CCC TGC TGC GAG TAC GGC AGC AGG ACC TTC GTG CAC CAC ACC GCC 238 

60 APCCEYGSRTFVHHTA 75

239 GAG ATC CCC GAC TTC TTC AAG CAG AGC TTC CCC GAG GGC TTC ACC TGG 286

76 EIPDFFKQSFPEGFTW 91

287 GAG AGA ACC ACC ACC TAC GAG GAC GGC GGC ATC CTG ACC GCC CAC CAG 334

92 ERTTTYEDGGILTAHQ 107

335 GAC ACC AGC CTG GAG GGC AAC TGC CTG ATC TAC AAG GTG AAG GTG CTG 382

108 DTSLEGNCLIYKVKVL 123



383 GGC ACC AAC TTC CCC GCC GAC GGC CCC GTG ATG AAG AAC AAG AGC GGC 430 

124 GTNFPADGPVMKNKSG 139 

431 GGC TGG GAG CCC AGC ACC GAG GTG GTG TAC CCC GAG AAC GGC GTG CTG 478 

140 GWEPSTEVVYPENGVL 155

479 TGC GGC CGG AAC GTG ATG GCC CTG AAG GTG GGC GAC CGG CGG CTG ATC 526 

156 CGRNVMALKVGDRRLI 171 

527 TGC CAC CAC TAC ACC AGC TAC CGG AGC AAG AAG GCC GTG CGG GCC CTG 574 

172 CHHYTSYRSKKAVRAL 187 

575 ACC ATG CCC GGC TTC CAC TTC ACC GAC ATC CGG CTG CAG ATG CTG CGG 622 

188 TMPGFHFTDIRLQMLR 203 

623 AAG GAG AAG GAC GAG TAC TTC GAG CTG TAC GAG GCC AGC GTG GCC CGG 670 

204 KEKDEYFELYEASVAR 219 

671 TAC AGC GAC CTG CCC GAG AAG GCC AAC AGA TCT CCC GGG ATG GTG AGC 718

220 YSDLPEKANRSPGMVS 235

719 GGC CTG CTG AAG GAG AGC ATG CGC ATC AAG ATG TAC ATG GAG GGC ACC 766 

236 GLLKESMRIKMYMEGT 251 

767 GTG AAC GGC CAC TAC TTC AAG TGC GAG GGC GAG GGC GAC GGC AAC CCC 814 

252 VNGHYFKCEGEGDGNP 267 



815 TTC GCC GGC ACC CAG AGC ATG CGG ATC CAC GTG ACC GAG GGC GCC CCC 862

268 FAGTQSMRIHVTEGAP 283



Figure 13 (continued) 

863 CTG CCC TTC GCC TTC GAC ATC CTG GCC CCC TGC TGC GAG TAC GGC AGC

284 LPFAFDILAPCCEYGS

911 AGG ACC TTC GTG CAC CAC ACC GCC GAG ATC CCC GAC TTC TTC AAG CAG

300 RTFVHHTAEIPDFFKQ

959 AGC TTC CCC GAG GGC TTC ACC TGG GAG AGA ACC ACC ACC TAC GAG GAC

316 SFPEGFTWERTTTYED

1007 GGC GGC ATC CTG ACC GCC CAC CAG GAC ACC AGC CTG GAG GGC AAC TGC

332 GGILTAHQDTSLEGNC

1055 CTG ATC TAC AAG GTG AAG GTG CTG GGC ACC AAC TTC CCC GCC GAC GGC

348 LIYKVKVLGTNFPADG

1103 CCC GTG ATG AAG AAC AAG AGC GGC GGC TGG GAG CCC AGC ACC GAG GTG

364 PVMKNKSGGWEPSTEV

1151 GTG TAC CCC GAG AAC GGC GTG CTG TGC GGC CGG AAC GTG ATG GCC CTG

380 VYPENGVLCGRNVMAL

1199 AAG GTG GGC GAC CGG CGG CTG ATC TGC CAC CAC TAC ACC AGC TAC CGG

396 KVGDRRLICHHYTSYR

1247 AGC AAG AAG GCC GTG CGG GCC CTG ACC ATG CCC GGC TTC CAC TTC ACC

412 SKKAVRALTMPGFHFT

1295 GAC ATC CGG CTG CAG ATG CTG CGG AAG GAG AAG GAC GAG TAC TTC GAG

428 DIRLQMLRKEKDEYFE

1343 CTG TAC GAG GCC AGC GTG GCC CGG TAC AGC GAC CTG CCC GAG AAG GCC

444 LYEASVARYSDLPEKA

1391 AAC AGA ACT CGA GCT ATG GAT GAT GAT ATC GCC G...

460 NRTRAMDDDIA...
actin

(SEQ ID NOS. 17 & 18).
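
The leader is labelled with AgeI in the listings, and the two linkers also happen to contain well-known restriction enzyme recognition sequences. The sketch below scans the leader and linker codons, as printed in Figures 12 and 13, for four recognition sequences. The choice of enzymes is an assumption made for illustration; the figures themselves name only AgeI, and nothing here implies that the other sites were used in constructing the sequences. `find_sites` is a hypothetical helper.

```python
# Recognition sequences (5'->3') of four common restriction enzymes.
SITES = {
    "AgeI": "ACCGGT",
    "BglII": "AGATCT",
    "SmaI": "CCCGGG",
    "XhoI": "CTCGAG",
}


def find_sites(nt, sites=SITES):
    """Return (enzyme, 1-based start) for every occurrence, sorted by position."""
    hits = []
    for name, motif in sites.items():
        start = nt.find(motif)
        while start != -1:
            hits.append((name, start + 1))
            start = nt.find(motif, start + 1)
    return sorted(hits, key=lambda hit: hit[1])


LEADER = "ACCGGTCGCCACC"                  # bases 1-13 of Figures 12 and 13
LINKER_BETWEEN_MONOMERS = "AGATCTCCCGGG"  # codons for R S P G
LINKER_TO_ACTIN = "AGAACTCGAGCT"          # codons for R T R A (Figure 13)

for label, seq in [
    ("leader", LEADER),
    ("monomer linker", LINKER_BETWEEN_MONOMERS),
    ("actin linker", LINKER_TO_ACTIN),
]:
    print(label, find_sites(seq))
```

Positions are counted within each short fragment, not within the full construct.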

Figure 15

Heteractis crispa chromoprotein hcCP mut C148S 



C148S according to GFP numbering 
C143S according to self-numbering. 

ATGGCTGGTTTGTTGAAAGAAAGTATGCGCATCAAGATGTACAT 
MAGLLKESMRIKMYM 

GGAAGGCACGGTTAATGGCCATTATTTCAAGTGTGAAGGAGAGGGAGACGGCAACCCATT 
EGTVNGHYFKCEGEGDGNPF 

TACAGGTACGCAGAGCATGAGGATTCATGTCACCGAAGGGGCTCCATTACCATTTGCCTT
TGTQSMRIHVTEGAPLPFAF 

CGACATTTTGGCACCGTGTTGTGAGTACGGCAGCAGGACCTTTGTCCACCATACGGCAGA 
DILAPCCEYGSRTFVHHTAE 

GATTCCCGATTTCTTCAAGCAGTCTTTCCCTGAAGGCTTTACTTGGGAAAGAACCACAAC 
IPDFFKQSFPEGFTWERTTT 



CTATGAAGATGGAGGCATTCTTACTGCTCATCAGGACACAAGCCTGGAGGGGAACTGCCT
YEDGGILTAHQDTSLEGNCL

TATATACAAGGTGAAAGTCCTTGGTACCAATTTTCCTGCTGATGGCCCCGTGATGAAGAA
IYKVKVLGTNFPADGPVMKN

CAAATCAGGAGGATGGGAGCCAAGCACTGAGGTGGTTTATCCAGAGAATGGTGTCCTGTG
KSGGWEPSTEVVYPENGVLC

TGGACGTAATGTGATGGCCCTTAAAGTCGGTGATCGTCGTTTGATCTGCCATCTCTATAC
GRNVMALKVGDRRLICHLYT

TTCTTACAGGTCCAAGAAAGCAGTCCGTGCCTTGACAATGCCAGGATTTCATTTTACAGA 
SYRSKKAVRALTMPGFHFTD 

CATCCGCCTTCAGATGCCGAGGAAAAAGAAAGACGAGTACTTTGAACTGTACGAAGCATC 
IRLQMPRKKKDEYFELYEAS 



TGTGGCTAGGTACAGTGATCTTCCTGAAAAAGCAAATTGA 
VARYSDLPEKAN* 
(SEQ ID NOs: 23 & 24)



Figure 16 



Crispa 44-6 mutant possesses six amino acid substitutions vs. wild type: 
A2S,T36A,A65E,C143S,L173H,P201L.

TCTGGTTTGTTGAAAGAAAGTATGCGCATCAAGATGTACAT 
SGLLKESMRIKMYM 

GGAAGGCACGGTTAATGGCCATTATTTCAAGTGTGAAGGAGAGGGAGACGGCAACCCATT 
EGTVNGHYFKCEGEGDGNPF 

TGCAGGTACGCAGAGCATGAGGATTCATGTCACCGAAGGGGCTCCATTACCATTTGCCTT 
AGTQSMRIHVTEGAPLPFAF

CGACATTTTGGCACCGTGTTGTGCGTACGGCAGCAGGACCTTTGTCCACCATACGGCAGA 
DILAPCCAYGSRTFVHHTAE 

GATTCCCGATTTCTTCAAGCAGTCTTTCCCTGAAGGCTTTACTTGGGAAAGAACCACAAC 
IPDFFKQSFPEGFTWERTTT 

CTATGAAGATGGAGGCATTCTTACTGCTCATCAGGACACAAGCCTGGAGGGGAACTGCCT 
YEDGGILTAHQDTSLEGNCL 


TATATACAAGGTGAAAGTCCTTGGTACCAATTTTCCTGCTGATGGCCCCGTGATGAAGAA 
IYKVKVLGTNFPADGPVMKN 

CAAATCAGGAGGATGGGAGCCAAGCACTGAGGTGGTTTATCCAGAGAATGGTGTCCTGTG 
KSGGWEPSTEVVYPENGVLC 



TGGACGTAATGTGATGGCCCTTAAAGTCGGTGATCGTCGTTTGATCTGCCATCACTATAC 
GRNVMALKVGDRRLICHHYT

TTCTTACAGGTCCAAGAAAGCAGTCCGTGCCTTGACAATGCCAGGATTTCATTTTACAGA 
SYRSKKAVRALTMPGFHFTD 



CATCCGCCTTCAGATGCTGAGGAAAAAGAAAGACGAGTACTTTGAACTGTACGAAGCATC 
IRLQMLRKKKDEYFELYEAS 

TGTGGCTAGGTACAGTGATCTTCCTGAAAAAGCAAATTGA 

VARYSDLPEKAN* 
(SEQ ID NOs: 25 & 26)



Figure 17 



Heteractis crispa chromoprotein wild type (base isoform) 

10 20 30 40 50 60 

5' ACCATTTGCTTTGGTTCCTTGGCAAACGAAAGTTTAGAACGAAAACTGACCCAAATTACA
70 80 90 100 110 120 

TCTTCCTCCTGGATCCTTACCATGGCTGGTTTGTTGAAAGAAAGTATGCGCATCAAGATG 

MAGLLKESMRIKM

130 140 150 160 170 180 

TACATGGAAGGCACGGTTAATGGCCATTATTTCAAGTGTGAAGGAGAGGGAGACGGCAAC 
YMEGTVNGHYFKCEGEGDGN 

190 200 210 220 230 240 

CCATTTACAGGTACGCAGAGCATGAGGATTCATGTCACCGAAGGGGCTCCATTACCATTT 
PFTGTQSMRIHVTEGAPLPF 

250 260 270 280 290 300 

GCCTTCGACATTTTGGCACCGTGTTGTGAGTACGGCAGCAGGACCTTTGTCCACCATACG 
AFDILAPCCEYGSRTFVHHT

310 320 330 340 350 360 

GCAGAGATTCCCGATTTCTTCAAGCAGTCTTTCCCTGAAGGCTTTACTTGGGAAAGAACC 
AEIPDFFKQSFPEGFTWERT 

370 380 390 400 410 420 

ACAACCTATGAAGATGGAGGCATTCTTACTGCTCATCAGGACACAAGCCTGGAGGGGAAC 
TTYEDGGILTAHQDTSLEGN 

430 440 450 460 470 480 

TGCCTTATATACAAGGTGAAAGTCCTTGGTACCAATTTTCCTGCTGATGGCCCCGTGATG 
CLIYKVKVLGTNFPADGPVM 

490 500 510 520 530 540 

AAGAACAAATCAGGAGGATGGGAGCCATGCACTGAGGTGGTTTATCCAGAGAATGGTGTC 
KNKSGGWEPCTEVVYPENGV 

550 560 570 580 590 600 

CTGTGTGGACGTAATGTGATGGCCCTTAAAGTCGGTGATCGTCGTTTGATCTGCCATCTC 
LCGRNVMALKVGDRRLICHL 

610 620 630 640 650 660 

TATACTTCTTACAGGTCCAAGAAAGCAGTCCGTGCCTTGACAATGCCAGGATTTCATTTT 
YTSYRSKKAVRALTMPGFHF

670 680 690 700 710 720 

ACAGACATCCGCCTTCAGATGCCGAGGAAAACGAAAGACGAGTACTTTGAACTGTACGAA 
TDIRLQMPRKTKDEYFELYE 

730 740 750 760 770 780 

GCATCTGTGGCTAGGTACAGTGATCTTCCTGAAAAAGCAAATTGATTGTTCCCAGTGACA 
ASVARYSDLPEKAN* 

790 800 810 820 830 840 

CCAGACTGCTGTCAGCTTTTGGTTAAAGCCCGAAAGACAAAAGGACATTTGTAGTTTAGT 

850 860 870 880 890 900 

TTATATTTCCCTTTCATTTGTGAATCAACATTGTACTCTCTGTAAACCTTTAAAATGCTC 

910 

CATTAAACCT 3' (SEQ ID NOs: 27 & 28)



